Evidence-Based Regularization for Neural Networks

Abstract

Numerous approaches address over-fitting in neural networks: imposing a penalty on the parameters of the network (L1, L2, etc.); changing the network stochastically (drop-out, Gaussian noise, etc.); or transforming the input data (batch normalization, etc.). In contrast, we aim to ensure that a minimum amount of supporting evidence is present when fitting the model to the training data. At the level of a single neuron, this is equivalent to ensuring that both sides of the separating hyperplane (for a standard artificial neuron) contain a minimum number of points, noting that these points need not belong to the same class for the inner layers. We first benchmark the results of this approach on the Fashion-MNIST dataset, comparing it to various regularization techniques. Interestingly, we note that nudging each neuron to divide, at least in part, its input data results in networks that make use of every neuron, avoiding configurations in which all points fall on one side of the hyperplane (which feeds a constant into the next layers). To illustrate this point, we study the prevalence of saturated nodes throughout training, showing that neurons are activated more frequently and earlier using our approach. A direct consequence of this improved activation is that deep networks are now easier to train. This is crucially important when the topology is not known a priori and training often remains stuck in suboptimal local minima. We demonstrate this property by increasing depth (and width); most methods result in increasingly frequent training failures (over different random seeds), whilst the proposed evidence-based regularization significantly outperforms them in its ability to train deep networks.
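To make the idea concrete, below is a minimal PyTorch sketch of one way such an evidence constraint could be expressed as a differentiable penalty; the function evidence_penalty, the minimum fraction min_frac, and the temperature tau are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch: penalize neurons whose hyperplane leaves fewer than
# `min_frac` of the batch on either side (soft counts via a sigmoid).
import torch

def evidence_penalty(pre_activations: torch.Tensor,
                     min_frac: float = 0.1,
                     tau: float = 1.0) -> torch.Tensor:
    """pre_activations: (batch, neurons), the values w.x + b of one layer."""
    pos_frac = torch.sigmoid(pre_activations / tau).mean(dim=0)  # per neuron
    neg_frac = 1.0 - pos_frac
    # Hinge on each side: zero once that side holds at least min_frac points.
    return (torch.relu(min_frac - pos_frac) + torch.relu(min_frac - neg_frac)).sum()
```

In a training loop this would be added to the task loss, e.g. loss = task_loss + lam * evidence_penalty(z) with z the pre-activations of a hidden layer; the hinge terms vanish once both half-spaces hold at least min_frac of the batch, so well-supported neurons are left untouched.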

Similar resources

Regularization for Neural Networks

Research into regularization techniques is motivated by the tendency of neural networks to learn the specifics of the dataset they were trained on rather than general features that are applicable to unseen data. This is known as overfitting. The goal of any supervised machine learning task is to approximate a function that maps inputs to outputs, given a dataset of examples and labels....

Group sparse regularization for deep neural networks

In this paper, we consider the joint task of simultaneously optimizing (i) the weights of a deep neural network, (ii) the number of neurons for each hidden layer, and (iii) the subset of active input features (i.e., feature selection). While these problems are generally dealt with separately, we present a simple regularized formulation that allows solving all three of them in parallel, using stan...
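For illustration, the standard group-lasso (l2,1) penalty below groups weights by neuron so that entire rows can be driven to zero at once; this is a generic sketch of the group-sparsity mechanism, not necessarily the paper's exact formulation.

```python
# Generic group-lasso (l2,1) penalty: one group per output neuron, so the
# optimizer can zero out whole neurons rather than individual weights.
import torch

def group_lasso(weight: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """weight: (out_features, in_features) matrix of a linear layer."""
    return torch.sqrt((weight ** 2).sum(dim=1) + eps).sum()
```

Applying the same penalty over columns (sum(dim=0)) of the first layer's weight matrix penalizes input features instead, which is how a single regularizer of this kind can also perform feature selection.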

Noisin: Unbiased Regularization for Recurrent Neural Networks

Recurrent neural networks (RNNs) are powerful models of sequential data. They have been successfully used in domains such as text and speech. However, RNNs are susceptible to overfitting; regularization is important. In this paper we develop Noisin, a new method for regularizing RNNs. Noisin injects random noise into the hidden states of the RNN and then maximizes the corresponding marginal lik...
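The generic mechanism, injecting noise into the hidden states during training, can be sketched as follows; the class name NoisyRNNCell and Gaussian noise with a fixed scale sigma are assumptions for illustration, and the paper's unbiasedness conditions on the noise distribution are not reproduced here.

```python
# Minimal sketch of noise injection into RNN hidden states; Gaussian noise
# with fixed scale `sigma` is an illustrative choice (class name assumed).
import torch
import torch.nn as nn

class NoisyRNNCell(nn.Module):
    def __init__(self, input_size: int, hidden_size: int, sigma: float = 0.1):
        super().__init__()
        self.cell = nn.RNNCell(input_size, hidden_size)
        self.sigma = sigma

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        h = self.cell(x, h)
        if self.training:  # perturb only while training
            h = h + self.sigma * torch.randn_like(h)
        return h
```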

Regularization parameter estimation for feedforward neural networks

Under the framework of the Kullback-Leibler (KL) distance, we show that a particular case of the Gaussian probability function for feedforward neural networks (NNs) reduces to the first-order Tikhonov regularizer. The smoothing parameter in kernel density estimation plays the role of the regularization parameter. Under some approximations, an estimation formula is derived for estimating the regularization p...
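For reference, a first-order Tikhonov regularizer penalizes the first derivatives of the learned mapping; the sketch below estimates it with autograd for a scalar-output network (the helper name tikhonov_penalty and the scalar-output assumption are illustrative, not taken from the paper).

```python
# Sketch of a first-order Tikhonov penalty: the mean squared norm of the
# network's input gradient (assumes model(x) returns one value per sample).
import torch

def tikhonov_penalty(model, x: torch.Tensor) -> torch.Tensor:
    """x: (batch, features); model output: (batch,) or (batch, 1)."""
    x = x.detach().clone().requires_grad_(True)
    y = model(x).sum()  # summing decouples per-sample gradients
    (grad,) = torch.autograd.grad(y, x, create_graph=True)
    return (grad ** 2).sum(dim=1).mean()
```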

GraphConnect: A Regularization Framework for Neural Networks

Deep neural networks have proved very successful in domains where large training sets are available, but when the number of training samples is small, their performance suffers from overfitting. Prior methods of reducing overfitting such as weight decay, Dropout and DropConnect are data-independent. This paper proposes a new method, GraphConnect, that is data-dependent, and is motivated by the ...
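A data-dependent penalty in this spirit can be sketched as a graph-weighted pull between embeddings of similar samples; the affinity matrix W and the helper graph_penalty below are illustrative assumptions rather than GraphConnect's exact objective.

```python
# Hedged sketch of a graph-based penalty: embeddings of samples that the
# affinity matrix W marks as similar are pulled together (Laplacian-style).
import torch

def graph_penalty(embeddings: torch.Tensor, W: torch.Tensor) -> torch.Tensor:
    """embeddings: (batch, dim); W: (batch, batch), nonnegative affinities."""
    d2 = torch.cdist(embeddings, embeddings, p=2) ** 2  # pairwise squared distances
    # Equals 2 * trace(E^T L E) with L the graph Laplacian of W.
    return (W * d2).sum() / W.sum().clamp_min(1e-8)
```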

Journal

Journal title: Machine Learning and Knowledge Extraction

Year: 2022

ISSN: 2504-4990

DOI: https://doi.org/10.3390/make4040051